Welcome to Practical 1, an introduction to using R and JAGS. In this practical we'll load some data into R, explore it with some simple plots, and then fit and check some simple Bayesian models using JAGS.
You should follow and run the commands shown in the grey boxes. At various points you will see a horizontal line in the text which indicates a question you should try to answer, like this:
Exercise X
What words does the following command print to the console?
print("Hello World")
If you are a confident R user, you might want to skip some questions: that's fine; we'd like you to get as much as possible out of the course, rather than completing every last thing.
If you get stuck, please get our attention and we will try to help! There aren't prepared answers to all of these questions, so keep your own record as you go. At the end of the practical are harder questions which you can attempt if you get through all of the material. If you find any mistakes in the document, please let us know.
You can run the code from these practicals by loading up the .Rmd file in the same directory in RStudio. This is an R Markdown document containing all the text. Feel free to add in your own answers, or edit the text to give yourself extra notes. You can also run the code directly by highlighting the relevant code and clicking Run.
R is an open-source, object-oriented language for doing statistics. All the data you load in, the internal functions, and any functions you write are objects with attributes. We'll get a feel for what this means as we go along.
If you need help with a function, you can type ?functionname at the console, for example ?print
Quick-R is a really nice reference website for R, and StackOverflow has loads of code for more difficult queries.
Exercise 1
Use the read.csv() function [a variant of read.table()] to load the data in the file 'https://raw.githubusercontent.com/andrewcparnell/tsme_course/master/data/hadcrut.csv'
Use the str() and head() functions to interrogate the resulting R object. What class of object is it?
hadcrut = read.csv('https://raw.githubusercontent.com/andrewcparnell/tsme_course/master/data/hadcrut.csv')
head(hadcrut)
## Year Anomaly
## 1 1850 -0.412
## 2 1851 -0.237
## 3 1852 -0.262
## 4 1853 -0.335
## 5 1854 -0.258
## 6 1855 -0.255
str(hadcrut)
## 'data.frame': 166 obs. of 2 variables:
## $ Year : int 1850 1851 1852 1853 1854 1855 1856 1857 1858 1859 ...
## $ Anomaly: num -0.412 -0.237 -0.262 -0.335 -0.258 -0.255 -0.382 -0.484 -0.427 -0.381 ...
There are a number of ways to index into a data frame (and other objects) in R. For example:
dat = hadcrut$Anomaly
dat = hadcrut[,2]
dat = hadcrut[,'Anomaly']
should all give the same result.
A nice way to refer to the data in an object while keeping code easy to understand is the with() function. For example, you can plot the data by combining the plot() and with() functions:
with(hadcrut,plot(Year, Anomaly))
Exercise 2
Is there a way to make the structure of the time series clearer? [Hint: look for the “type” option.]
You can convert the Anomaly data to a time series object using the function ts() if you like. To keep things simple, you'll only need the start, end, and possibly frequency arguments.
Is there a relationship between a value of the time series and the value at the next time step? You should be able to see such a relationship in a lagged scatter plot (lag plot), plotting each value of the time series \(y_{i}\) against the previous value \(y_{i-1}\).
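One possible approach for the first two parts (a minimal sketch):
# A line plot makes the time structure much clearer
with(hadcrut, plot(Year, Anomaly, type = 'l'))

# Convert to a ts object: annual data starting in 1850, so frequency = 1
hadcrut_ts = ts(hadcrut$Anomaly, start = 1850, frequency = 1)
plot(hadcrut_ts)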
Exercise 3
Hint: use plot(x, y), where x and y are your lagged variables.
with(hadcrut,
     plot(Anomaly[1:(length(Anomaly) - 1)], Anomaly[2:length(Anomaly)])
)
R contains a great many useful statistical functions; for checking for autocorrelation, for example, there are acf() and pacf(). We'll come to these in a later module.
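If you're curious now, a one-line preview:
acf(hadcrut$Anomaly) # Expect strong autocorrelation for a trending series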
Bonus exercises
Apply a 30-year moving window (average) to the hadcrut data set. What do you see? Is this a real signal?
Apply a similar window to a simulated AR(1) process (with drift). Do you see something similar? Fit a lowess (local regression smoother) to the HadCRUT data. A sketch for these follows below.
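Here is a minimal sketch for the moving-window and lowess parts, using stats::filter() for a centred 30-year moving average (the AR(1) parameters below are purely illustrative):
# 30-year centred moving average of the anomalies
ma30 = stats::filter(hadcrut$Anomaly, rep(1 / 30, 30), sides = 2)
with(hadcrut, plot(Year, Anomaly, type = 'l', col = 'grey'))
lines(hadcrut$Year, ma30, col = 'red')

# A lowess smoother for comparison
with(hadcrut, lines(lowess(Year, Anomaly), col = 'blue'))

# The same window applied to a simulated AR(1) process with drift
set.seed(1)
n = nrow(hadcrut)
ar1 = arima.sim(model = list(ar = 0.9), n = n) + 0.005 * (1:n)
plot(1:n, ar1, type = 'l', col = 'grey')
lines(1:n, stats::filter(ar1, rep(1 / 30, 30), sides = 2), col = 'red')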
In this part of the practical, we will simulate data from a simple linear regression model, specify the model in JAGS, fit it, and check the output.
A simple linear model is written:
\(y_i = \alpha + \beta x_i + \epsilon_i\)
where \(\alpha\) is a constant (the intercept), \(\beta\) is the regression coefficient, and \(\epsilon_i\) is Gaussian noise with variance \(\sigma^{2}\), independent across observations:
\[\epsilon_i \sim N(0, \sigma^{2})\]
In R notation, we can write this as:
# Likelihood:
# y[i] ~ N(alpha + beta * x[i], sigma^2)
# Priors - all vague
# alpha ~ N(0,100)
# beta ~ N(0,100)
# sigma ~ U(0,10)
Here is some R code that you can run to simulate data from the linear model.
T = 50 # Number of observations
alpha = 2
beta = 3
# sigma = 1 # A less noisy alternative to try
sigma = 3
# Set the seed so this is repeatable
set.seed(123)
x = sort(runif(T, 0, 10)) # Sort as it makes the plotted lines neater
y = rnorm(T, mean = alpha + beta * x, sd = sigma)
Exercise 4
Plot the simulated data, and overlay the true regression line:
plot(x, y)
lines(x, alpha + beta * x)
We’d like to fit a Bayesian model to the simulated data. Here is how we specify the JAGS model code in R:
model_code = '
model
{
# Likelihood
for (t in 1:T) {
y[t] ~ dnorm(alpha + beta * x[t],tau)
}
# Andrew's priors (a vaguer alternative)
# alpha ~ dnorm(0.0, 0.01)
# beta ~ dnorm(0.0, 0.01)
# tau <- 1 / pow(sigma, 2) # Turn the standard deviation into a precision
# sigma ~ dunif(0.0, 10.0)
# Doug's priors
alpha ~ dnorm(0.0, 3)
beta ~ dnorm(0.0, 3)
tau <- 1 / pow(sigma, 2) # Turn the standard deviation into a precision
sigma ~ dunif(0.0,10.0)
}
'
We specify a likelihood and priors for each of the parameters. Note that JAGS parameterises the normal distribution in terms of the precision (\(1/\sigma^{2}\)) rather than the variance, so we put a prior on sigma and calculate the precision tau from it for the likelihood.
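For example, converting a standard deviation into the precision that JAGS expects:
# A standard deviation of 2 corresponds to a precision of 1/4
sigma_example = 2
tau_example = 1 / sigma_example^2 # 0.25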
Bonus exercise
Compare the two sets of priors in the model code above. How would you expect the vaguer set to change the posterior?
We set up the data as a list object so that we can get it into the jags function.
# Set up the data as a list object, and inspect it
model_data = list(T = T, y = y, x = x)
str(model_data)
## List of 3
## $ T: num 50
## $ y: num [1:50] -2.32 5.78 3.83 1.67 9.93 ...
## $ x: num [1:50] 0.246 0.421 0.456 1.029 1.388 ...
We choose the model parameters that we’d like the function to watch and output results for.
model_parameters = c("alpha","beta","sigma")
Then we run the JAGS model with some appropriate choices. The jags() function comes from the R2jags package, so make sure that is loaded.
library(R2jags) # Provides the jags() function
model_run = jags(data = model_data,
parameters.to.save = model_parameters,
model.file=textConnection(model_code),
n.chains=4, # Number of different starting positions
n.iter=1000, # Number of iterations
n.burnin=200, # Number of iterations to remove at start
n.thin=2) # Amount of thinning
## module glm loaded
## Compiling model graph
## Resolving undeclared variables
## Allocating nodes
## Graph information:
## Observed stochastic nodes: 50
## Unobserved stochastic nodes: 3
## Total graph size: 214
##
## Initializing model
The jags function has created a sequence of samples from the posterior distribution of the parameters that we asked it to save.
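The samples themselves live in the BUGSoutput component of the object (we use this later in the practical). For example:
# Extract the posterior samples for alpha and summarise them directly
alpha_samples = model_run$BUGSoutput$sims.list$alpha
hist(alpha_samples, breaks = 30)
quantile(alpha_samples, c(0.025, 0.975)) # A 95% credible interval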
Exercise 5
Use the names() function to examine the structure of the model_run object.
Use the plot(), print() and traceplot() functions on your model_run object to check the output.
Are the true values inside the 95% CI (credible interval)?
Look at the R-hat values - they need to be close to 1 if convergence has been achieved.
How close is the posterior mean regression line to the true value?
names(model_run)
## [1] "model" "BUGSoutput" "parameters.to.save"
## [4] "model.file" "n.iter" "DIC"
plot(model_run)
print(model_run)
## Inference for Bugs model at "5", fit using jags,
## 4 chains, each with 1000 iterations (first 200 discarded), n.thin = 2
## n.sims = 1600 iterations saved
## mu.vect sd.vect 2.5% 25% 50% 75% 97.5% Rhat
## alpha 1.113 0.496 0.145 0.767 1.131 1.435 2.043 1.002
## beta 3.089 0.099 2.888 3.027 3.088 3.155 3.277 1.000
## sigma 2.985 0.326 2.430 2.751 2.951 3.187 3.692 1.003
## deviance 250.010 3.238 245.340 247.582 249.589 251.754 257.576 1.003
## n.eff
## alpha 1400
## beta 1600
## sigma 730
## deviance 890
##
## For each parameter, n.eff is a crude measure of effective sample size,
## and Rhat is the potential scale reduction factor (at convergence, Rhat=1).
##
## DIC info (using the rule, pD = var(deviance)/2)
## pD = 5.2 and DIC = 255.2
## DIC is an estimate of expected predictive error (lower deviance is better).
traceplot(model_run)
# Create a plot of the posterior mean regression line
post = print(model_run)
## Inference for Bugs model at "5", fit using jags,
## 4 chains, each with 1000 iterations (first 200 discarded), n.thin = 2
## n.sims = 1600 iterations saved
## mu.vect sd.vect 2.5% 25% 50% 75% 97.5% Rhat
## alpha 1.113 0.496 0.145 0.767 1.131 1.435 2.043 1.002
## beta 3.089 0.099 2.888 3.027 3.088 3.155 3.277 1.000
## sigma 2.985 0.326 2.430 2.751 2.951 3.187 3.692 1.003
## deviance 250.010 3.238 245.340 247.582 249.589 251.754 257.576 1.003
## n.eff
## alpha 1400
## beta 1600
## sigma 730
## deviance 890
##
## For each parameter, n.eff is a crude measure of effective sample size,
## and Rhat is the potential scale reduction factor (at convergence, Rhat=1).
##
## DIC info (using the rule, pD = var(deviance)/2)
## pD = 5.2 and DIC = 255.2
## DIC is an estimate of expected predictive error (lower deviance is better).
alpha_mean = post$mean$alpha
beta_mean = post$mean$beta
plot(x, y)
lines(x, alpha_mean + beta_mean * x, col = 'red')
lines(x, alpha + beta * x, col = 'blue')
legend('topleft',
legend = c('Truth', 'Posterior mean'),
lty=1,
col=c('blue','red'))
# Blue and red lines should be pretty close
Bonus exercises
1. With the simulated data, experiment with changing the value of T when creating the data. What happens to the posterior distribution of the parameters as T gets bigger?
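A minimal sketch for this, re-using the true parameter values, the parameter list and the model code from above (the _big object names are just for illustration):
# Re-simulate with a larger sample size and re-fit the same model
T_big = 500
x_big = sort(runif(T_big, 0, 10))
y_big = rnorm(T_big, mean = alpha + beta * x_big, sd = sigma)
model_data_big = list(T = T_big, y = y_big, x = x_big)
model_run_big = jags(data = model_data_big,
                     parameters.to.save = model_parameters,
                     model.file = textConnection(model_code),
                     n.chains = 4, n.iter = 1000,
                     n.burnin = 200, n.thin = 2)
print(model_run_big) # The posterior standard deviations should shrink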
Church and White (2011) estimated sea level rise from the late 19th to the early 21st century, splicing tide gauge and satellite altimetry data together.
Load the data set using the following code:
sea_level = read.csv('https://raw.githubusercontent.com/andrewcparnell/tsme_course/master/data/church_and_white_global_tide_gauge.csv')
Exercise 6
1. What is the rate of sea level rise for the whole data set? Ignore the errors for the moment and run a simple linear regression.
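A minimal sketch for the regression, assuming (as with hadcrut) that the first column holds the year and the second the sea level; check the actual layout with str() first.
str(sea_level) # Check the column layout before fitting anything
# Assuming column 1 is the year and column 2 is the sea level:
fit = lm(sea_level[, 2] ~ sea_level[, 1])
summary(fit) # The slope is the average rate of rise per year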
As a bonus, we’ll look at a logistic regression example. You’ll need the boot package for the logit transform.
library(boot) # Package contains the logit transform
Here is the R notation for the model that we’ll fit in this example.
# y_t = binomial (often binary) response variable for observation t=1,...,N
# x_{1t} = first explanatory variable for observation t
# x_{2t} = second explanatory variable for observation t
# p_t = probability of y_t being 1 for observation t
# alpha = intercept term
# beta_1 = parameter value for explanatory variable 1
# beta_2 = parameter value for explanatory variable 2
# Likelihood
# y_t ~ Binomial(K,p_t), or Binomial(1,p_t) if binary
# logit(p_t) = alpha + beta_1 * x_1[t] + beta_2 * x_2[t]
# where logit(p_t) = log( p_t / (1 - p_t) )
# Note that p_t has to be between 0 and 1, but logit(p_t) has no limits
# Priors - all vague
# alpha ~ normal(0,100)
# beta_1 ~ normal(0,100)
# beta_2 ~ normal(0,100)
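As a quick sanity check of the logit transform, using the boot functions loaded above:
# logit() and inv.logit() are inverses of each other
logit(0.5) # 0
inv.logit(0) # 0.5
inv.logit(logit(0.9)) # 0.9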
And here is some R code to simulate from the model.
T = 100
set.seed(123)
x_1 = sort(runif(T,0,10))
x_2 = sort(runif(T,0,10))
alpha = 1
beta_1 = 0.2
beta_2 = -0.5
logit_p = alpha + beta_1 * x_1 + beta_2 * x_2
p = inv.logit(logit_p)
y = rbinom(T,1,p)
Exercise 7
What is the effect of x_1 and x_2 on y?
plot(x_1,y)
plot(x_2, y) # Clearly when x_2 is high, y tends to be 0
Here is the JAGS code to fit the model to the simulated data.
model_code = '
model
{
# Likelihood
for (t in 1:T) {
y[t] ~ dbin(p[t], K)
logit(p[t]) <- alpha + beta_1 * x_1[t] + beta_2 * x_2[t]
}
# Priors
alpha ~ dnorm(0.0,0.01)
beta_1 ~ dnorm(0.0,0.01)
beta_2 ~ dnorm(0.0,0.01)
}
'
Exercise 8
Set up the data, run the JAGS model on the simulated data, and check whether the true parameter values are recovered.
# Set up the data
model_data = list(T = T, y = y, x_1 = x_1, x_2 = x_2, K = 1)
# Choose the parameters to watch
model_parameters = c("alpha", "beta_1", "beta_2")
# Run the model
model_run = jags(data = model_data,
parameters.to.save = model_parameters,
model.file = textConnection(model_code),
n.chains = 4,
n.iter = 1000,
n.burnin = 200,
n.thin = 2)
## Compiling model graph
## Resolving undeclared variables
## Allocating nodes
## Graph information:
## Observed stochastic nodes: 100
## Unobserved stochastic nodes: 3
## Total graph size: 711
##
## Initializing model
# Check the output - are the true values inside the 95% CI?
# Also look at the R-hat values - they need to be close to 1 if convergence has been achieved
plot(model_run)
print(model_run)
## Inference for Bugs model at "6", fit using jags,
## 4 chains, each with 1000 iterations (first 200 discarded), n.thin = 2
## n.sims = 1600 iterations saved
## mu.vect sd.vect 2.5% 25% 50% 75% 97.5% Rhat
## alpha 1.652 0.770 0.189 1.130 1.639 2.169 3.236 1.000
## beta_1 0.595 0.854 -1.120 -0.007 0.637 1.179 2.302 1.002
## beta_2 -1.018 0.949 -2.909 -1.671 -1.057 -0.341 0.861 1.001
## deviance 119.242 2.440 116.399 117.421 118.599 120.455 125.367 1.007
## n.eff
## alpha 1600
## beta_1 1400
## beta_2 1500
## deviance 530
##
## For each parameter, n.eff is a crude measure of effective sample size,
## and Rhat is the potential scale reduction factor (at convergence, Rhat=1).
##
## DIC info (using the rule, pD = var(deviance)/2)
## pD = 3.0 and DIC = 122.2
## DIC is an estimate of expected predictive error (lower deviance is better).
traceplot(model_run)
# Create a plot of the posterior mean regression line
post = print(model_run)
## Inference for Bugs model at "6", fit using jags,
## 4 chains, each with 1000 iterations (first 200 discarded), n.thin = 2
## n.sims = 1600 iterations saved
## mu.vect sd.vect 2.5% 25% 50% 75% 97.5% Rhat
## alpha 1.652 0.770 0.189 1.130 1.639 2.169 3.236 1.000
## beta_1 0.595 0.854 -1.120 -0.007 0.637 1.179 2.302 1.002
## beta_2 -1.018 0.949 -2.909 -1.671 -1.057 -0.341 0.861 1.001
## deviance 119.242 2.440 116.399 117.421 118.599 120.455 125.367 1.007
## n.eff
## alpha 1600
## beta_1 1400
## beta_2 1500
## deviance 530
##
## For each parameter, n.eff is a crude measure of effective sample size,
## and Rhat is the potential scale reduction factor (at convergence, Rhat=1).
##
## DIC info (using the rule, pD = var(deviance)/2)
## pD = 3.0 and DIC = 122.2
## DIC is an estimate of expected predictive error (lower deviance is better).
alpha_mean = post$mean$alpha
beta_1_mean = post$mean$beta_1
beta_2_mean = post$mean$beta_2
As we have two explanatory variables, we can create two plots holding one of the variables fixed whilst varying the other.
# As we have two explanatory variables I'm going to create two plots
# holding one of the variables fixed whilst varying the other
par(mfrow = c(2, 1))
plot(x_1, y)
lines(x_1,
inv.logit(alpha_mean + beta_1_mean * x_1 + beta_2_mean * mean(x_2)),
col = 'red')
plot(x_2, y)
lines(x_2,
inv.logit(alpha_mean + beta_1_mean * mean(x_1) + beta_2_mean * x_2),
col = 'red')
# Line for x_1 should be increasing with x_1, and vice versa with x_2
We'll use adapted moth mortality data from Royle and Dorazio (Chapter 2).
T = 12 # Number of observations (6 dose levels for each sex)
K = 20 # Number of moths in each batch
y = c(1,4,9,13,18,20, 0,2,6,10,12,16) # Number dying in each batch
sex = c(rep('male',6), rep('female',6))
dose = rep(0:5, 2)
sexcode = as.integer(sex == 'male') # 1 = male, 0 = female
Exercise 9
Fit the logistic regression model to the moth mortality data, and interpret the effects of sex and dose.
# Set up the data
real_data = list(T = T, K = K, y = y, x_1 = sexcode, x_2 = dose)
# Run the model
real_data_run = jags(data = real_data,
parameters.to.save = model_parameters,
model.file = textConnection(model_code),
n.chains = 4,
n.iter = 1000,
n.burnin = 200,
n.thin = 2)
## Compiling model graph
## Resolving undeclared variables
## Allocating nodes
## Graph information:
## Observed stochastic nodes: 12
## Unobserved stochastic nodes: 3
## Total graph size: 79
##
## Initializing model
# Plot output
print(real_data_run)
## Inference for Bugs model at "7", fit using jags,
## 4 chains, each with 1000 iterations (first 200 discarded), n.thin = 2
## n.sims = 1600 iterations saved
## mu.vect sd.vect 2.5% 25% 50% 75% 97.5% Rhat n.eff
## alpha -3.360 0.515 -4.293 -3.685 -3.367 -3.059 -2.400 1.009 320
## beta_1 1.055 0.358 0.344 0.821 1.057 1.296 1.764 1.050 250
## beta_2 1.034 0.145 0.785 0.957 1.034 1.122 1.290 1.007 620
## deviance 40.210 5.821 37.052 38.039 39.093 40.774 47.123 1.008 980
##
## For each parameter, n.eff is a crude measure of effective sample size,
## and Rhat is the potential scale reduction factor (at convergence, Rhat=1).
##
## DIC info (using the rule, pD = var(deviance)/2)
## pD = 16.9 and DIC = 57.1
## DIC is an estimate of expected predictive error (lower deviance is better).
# Create the same plot as before (only for dose though)
post = print(real_data_run)
## Inference for Bugs model at "7", fit using jags,
## 4 chains, each with 1000 iterations (first 200 discarded), n.thin = 2
## n.sims = 1600 iterations saved
## mu.vect sd.vect 2.5% 25% 50% 75% 97.5% Rhat n.eff
## alpha -3.360 0.515 -4.293 -3.685 -3.367 -3.059 -2.400 1.009 320
## beta_1 1.055 0.358 0.344 0.821 1.057 1.296 1.764 1.050 250
## beta_2 1.034 0.145 0.785 0.957 1.034 1.122 1.290 1.007 620
## deviance 40.210 5.821 37.052 38.039 39.093 40.774 47.123 1.008 980
##
## For each parameter, n.eff is a crude measure of effective sample size,
## and Rhat is the potential scale reduction factor (at convergence, Rhat=1).
##
## DIC info (using the rule, pD = var(deviance)/2)
## pD = 16.9 and DIC = 57.1
## DIC is an estimate of expected predictive error (lower deviance is better).
alpha_mean = post$mean$alpha
beta_1_mean = post$mean$beta_1
beta_2_mean = post$mean$beta_2
# Look at effect of sex - quantified by beta_1
hist(real_data_run$BUGSoutput$sims.list$beta_1, breaks = 30)
# Seems positive - males more likely to die
# What about effect of dose?
o = order(real_data$x_2)
par(mfrow=c(1,1)) # Reset plots
with(real_data,plot(x_2, y, pch = sexcode)) # Data
# Males
with(real_data,
lines(x_2[o],
K*inv.logit(alpha_mean + beta_1_mean * 1 + beta_2_mean * x_2[o]),
col = 'red'))
# Females
with(real_data,
lines(x_2[o],
K*inv.logit(alpha_mean + beta_1_mean * 0 + beta_2_mean * x_2[o]),
col = 'blue'))
# Legend
legend('topleft',
legend = c('Males', 'Females'),
lty = 1,
col = c('red','blue'))